From PropBank to EngValLex: Adapting the PropBank-Lexicon to the Valency Theory of the Functional Generative Description

نویسنده

  • Silvie Cinková
چکیده

EngValLex is the name of an FGD-compliant valency lexicon of English verbs, built from the PropBank-Lexicon and following the structure of Vallex, the FGD-based lexicon of Czech verbs. EngValLex is interlinked with the PropBank-Lexicon, thus preserving the original links between the PropBank-Lexicon and the PropBank-Corpus. Therefore it is also supposed to be part of corpus annotation. This paper describes the automatic conversion of the PropBank-Lexicon into Pre-EngValLex, as well as the progress of its subsequent manual refinement (EngValLex). At the start, the Propbank-arguments were automatically re-labeled with functors (semantic labels of FGD) and the PropBank-rolesets were split into the respective example sentences, which became FGD-valency frames of PreEngValLex. Human annotators check and correct the labels and make the preliminary valency frames FGD-compliant. The most essential theoretical difference between the original and EngValLex is the syntactic alternations used by the PropBank-Lexicon, not yet employed within the Czech framework. The alternation-based approach substantially affects the conception of the frame, making in very different from the one applied within the FGD-framework. Preserving the valuable alternation information required special linguistic rules for keeping, altering and re-merging the automatically generated preliminary valency frames.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Constructing An English Valency Lexicon

This paper presents the English valency lexicon EngValLex, built within the Functional Generative Description framework. The form of the lexicon, as well as the process of its semi-automatic creation is described. The lexicon describes valency for verbs and also includes links to other lexical sources, namely PropBank. Basic statistics about the lexicon are given. The lexicon will be later used...

متن کامل

Czech-English Bilingual Valency Lexicon Online

We describe CzEngVallex, a bilingual Czech–English valency lexicon which aligns verbal valency frames and their arguments. It is based on a parallel Czech-English corpus, the Prague Czech-English Dependency Treebank (PCEDT), where for each occurrence of a verb, a reference to the underlying Czech and English valency lexicons (PDT-Vallex and CzEngVallex, respectively) is recorded. The CzEngValle...

متن کامل

Integrating Generative Lexicon and Lexical Semantic Resources

In this tutorial, we demonstrate how elements of Generative Lexicon Theory (GL) can be used to help enrich both established and developing lexical and computational semantic resources within the CL community. This includes lexicons, ontologies, annotation schemes, and annotated corpora (WordNet, VerbNet, PropBank, FrameNet, AMR, GMB, SIMPLE, and others). The tutorial is organized into two parts...

متن کامل

An Approach to Take Multi-Word Expressions

This research discusses preliminary efforts to expand the coverage of the PropBank lexicon to multi-word and idiomatic expressions, such as take one for the team. Given overwhelming numbers of such expressions, an efficient way for increasing coverage is needed. This research discusses an approach to adding multiword expressions to the PropBank lexicon in an effective yet semantically rich fash...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006